Spell-Checking based on Syllabification and Character-level Graphs for a Peruvian Agglutinative Language

Authors

  • Carlo Alva
  • Arturo Oncevay-Marcos
Abstract

Peru has several native languages, most of them agglutinative. These languages are transmitted from generation to generation mainly in oral form, which has led to different written forms across communities. Recent efforts therefore aim to standardize spelling in written texts, and an automatic tool such as a spell-checker would support this work. The spelling corrector under development is based on two steps: an automatic rule-based syllabification method and a character-level graph that detects the degree of error in a misspelled word. The experiments were carried out on Shipibo-konibo, a highly agglutinative Amazonian language, and the results on a dataset built for this purpose are promising.
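The abstract names the two steps but gives no rules or graph definition, so the following is only a minimal sketch under stated assumptions, not the authors' method. The vowel set, the "close a syllable after each vowel" rule, the placeholder lexicon, and the functions `syllabify`, `build_graph`, and `error_degree` are all hypothetical. The graph here simply links adjacent characters seen in known words, and the error degree is the share of a word's character transitions that the graph has never seen. How the paper combines syllabification with the graph is not stated in the abstract, so the two pieces are kept separate.

```python
from collections import defaultdict

# Assumption: a toy vowel inventory, not the real Shipibo-konibo one.
VOWELS = set("aeiou")


def syllabify(word):
    """Toy rule-based syllabifier: close a syllable after each vowel.

    A trailing consonant run is attached to the last syllable.
    This is an illustrative rule, not the rule set used in the paper.
    """
    syllables, current = [], ""
    for ch in word:
        current += ch
        if ch in VOWELS:
            syllables.append(current)
            current = ""
    if current:
        if syllables:
            syllables[-1] += current
        else:
            syllables.append(current)
    return syllables


def build_graph(lexicon):
    """Build a character-level graph: an edge a -> b for every adjacent pair.

    '^' and '$' mark word start and word end.
    """
    graph = defaultdict(set)
    for word in lexicon:
        padded = "^" + word + "$"
        for a, b in zip(padded, padded[1:]):
            graph[a].add(b)
    return graph


def error_degree(word, graph):
    """Fraction of the word's character transitions that are absent from the graph."""
    padded = "^" + word + "$"
    pairs = list(zip(padded, padded[1:]))
    missing = sum(1 for a, b in pairs if b not in graph.get(a, ()))
    return missing / len(pairs)


# Placeholder lexicon used only for illustration.
graph = build_graph(["jakon", "bake"])
print(syllabify("jakon"))
print(error_degree("jakon", graph))
print(error_degree("jaxon", graph))
```

A word whose transitions all appear in the graph scores 0, and the score grows with each unseen transition, which gives a rough ranking signal for how badly a word is misspelled.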


Similar resources

Using Google to Create a More Accurate and Easily-Extensible Spell Corrector

Spell checkers are now a common, integrated part of many commercial and freely available word processing programs. Agglutinative languages (such as Hungarian and Finnish) pose a separate problem, as there are many different "correct" forms for any given word. Due to the seemingly infinite number of possible words, the limited scope of a dictionary (provided with most spell-checking software) ...


Morphology-Aware Spell-Checking Dictionary for Esperanto

The article describes the process of constructing a spell checker for the Esperanto language and its implementation as a dictionary (i.e. an affix file and a word list) for the Hunspell spell-checking engine. In comparison to existing solutions, the chosen approach takes note of morphologically complex words, which are common in Esperanto due to its agglutinative nature, and applies a set of ru...


Chinese Spell Checking Based on Noisy Channel Model

Chinese spell checking is an important component of many NLP applications, including word processors, search engines, and automatic essay rating. Compared to English, Chinese has no word boundaries and there are various Chinese input methods that cause different kinds of typos, so it is more difficult to develop spell checkers for Chinese. In this paper, we introduce a novel method for correcti...


Text Segmentation for Chinese Spell Checking

Chinese spell checking is different from its counterparts for Western languages because Chinese words in texts are not separated by spaces. Chinese spell checking in this article refers to how to identify the misuse of characters in text composition. In other words, it is error correction at the word level rather than at the character level. Before Chinese sentences are spell checked, the text ...


Raslan 2009

The article describes the process of constructing a spell checker for the Esperanto language and its implementation as a dictionary (i.e. an affix file and a word list) for the Hunspell spell-checking engine. In comparison to existing solutions, the chosen approach takes note of morphologically complex words, which are common in Esperanto due to its agglutinative nature, and applies a set of ru...


Publication date: 2017